
    Knowledge representation issues in control knowledge learning

    Seventeenth International Conference on Machine Learning, Stanford, CA, USA, 29 June-2 July 2000.
    Knowledge representation is a key issue for any machine learning task. There have already been many comparative studies about knowledge representation with respect to machine learning in classification tasks. However, apart from some work done on reinforcement learning techniques in relation to state representation, very few studies have concentrated on the effect of knowledge representation for machine learning applied to problem solving, and more specifically, to planning. In this paper, we present an experimental comparative study of the effect of changing the input representation of planning domain knowledge on control knowledge learning. We show results in two classical domains using three different machine learning systems that have previously shown their effectiveness at learning planning control knowledge: a pure EBL mechanism, a combination of EBL and induction (HAMLET), and a Genetic Programming based system (EvoCK).
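
    To make the object of study concrete, the sketch below shows one plausible Python encoding of a PRODIGY-style control rule, the kind of target concept whose input representation the study varies. It is purely illustrative: the rule name, predicates, and matching scheme are invented here, not taken from the paper or from any of the three systems.

```python
# A minimal sketch (not the paper's code) of how a PRODIGY-style control
# rule might be represented: an if-then rule that, when its conditions
# match the current search meta-state, recommends selecting an operator.
# The literals below (e.g. "holding", "on") are hypothetical blocks-world
# predicates, used only for illustration.

from dataclasses import dataclass

@dataclass
class ControlRule:
    name: str
    conditions: frozenset   # literals that must hold in the meta-state
    recommendation: tuple   # e.g. ("select-operator", "STACK")

    def applies(self, meta_state: set) -> bool:
        # A rule fires when all of its conditions appear in the current
        # meta-state (goals, known facts, candidate operators).
        return self.conditions <= meta_state

rule = ControlRule(
    name="prefer-stack-when-holding",
    conditions=frozenset({("holding", "A"), ("goal", ("on", "A", "B"))}),
    recommendation=("select-operator", "STACK"),
)

meta_state = {("holding", "A"), ("goal", ("on", "A", "B")), ("clear", "B")}
if rule.applies(meta_state):
    print(rule.recommendation)   # ('select-operator', 'STACK')
```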

    Learning to solve planning problems efficiently by means of genetic programming

    Declarative problem solving, such as planning, poses interesting challenges for Genetic Programming (GP). There have been recent attempts to apply GP to planning that fit two approaches: (a) using GP to search in plan space, or (b) using it to evolve a planner. In this article, we propose to evolve only the heuristics to make a particular planner more efficient. This approach is more feasible than (b) because it does not have to build a planner from scratch but can take advantage of already existing planning systems. It is also more efficient than (a) because once the heuristics have been evolved, they can be used to solve a whole class of different planning problems in a planning domain, instead of running GP for every new planning problem. Empirical results show that our approach (EvoCK) is able to evolve heuristics in two planning domains (the blocks world and the logistics domain) that improve PRODIGY4.0 performance. Additionally, we experiment with a new genetic operator - Instance-Based Crossover - that is able to use traces of the base planner as raw genetic material to be injected into the evolving population.
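
    As a rough illustration of the approach (not the paper's EvoCK implementation), the sketch below evolves a weight vector over hypothetical search-node features, scoring each candidate by how many training problems a planner stub solves with it. The stub stands in for PRODIGY4.0, and the Instance-Based Crossover operator is not reproduced here.

```python
# A minimal GP-style loop for the evolve-the-heuristic idea, assuming a
# base planner we can call with a candidate heuristic and then measure
# success. `run_planner` is a stand-in stub, not a real planner.

import random

random.seed(0)
N_FEATURES = 4          # hypothetical features of a search node

def run_planner(weights, problem):
    # Stub: pretend the planner solves the problem iff the weighted
    # feature score points in the problem's "good" direction.
    return sum(w * f for w, f in zip(weights, problem)) > 0

def fitness(weights, problems):
    # Fitness = how many training problems the planner now solves.
    return sum(run_planner(weights, p) for p in problems)

def crossover(a, b):
    cut = random.randrange(1, N_FEATURES)
    return a[:cut] + b[cut:]

def mutate(ind):
    ind = list(ind)
    ind[random.randrange(N_FEATURES)] += random.gauss(0, 0.5)
    return tuple(ind)

problems = [tuple(random.uniform(-1, 1) for _ in range(N_FEATURES))
            for _ in range(30)]
pop = [tuple(random.uniform(-1, 1) for _ in range(N_FEATURES))
       for _ in range(20)]

for gen in range(25):
    pop.sort(key=lambda ind: fitness(ind, problems), reverse=True)
    elite = pop[:5]     # keep the best heuristics, breed the rest
    pop = elite + [mutate(crossover(random.choice(elite),
                                    random.choice(elite)))
                   for _ in range(15)]

best = max(pop, key=lambda ind: fitness(ind, problems))
print("solved", fitness(best, problems), "of", len(problems))
```

    The key property the article exploits appears even in this toy: the evolved heuristic is reusable across all problems of the domain, so the (expensive) evolutionary search is paid once per domain rather than once per problem.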

    Error Analysis and Correction for Weighted A*'s Suboptimality (Extended Version)

    Weighted A* (wA*) is a widely used algorithm for rapidly, but suboptimally, solving planning and search problems. The cost of the solution it produces is guaranteed to be at most W times the optimal solution cost, where W is the weight wA* uses in prioritizing open nodes. W is therefore a suboptimality bound for the solution produced by wA*. There is broad consensus that this bound is not very accurate, i.e. that the actual suboptimality of wA*'s solution is often much less than W times optimal. However, there is very little published evidence supporting that view, and no existing explanation of why W is a poor bound. This paper fills in these gaps in the literature. We begin with a large-scale experiment demonstrating that, across a wide variety of domains and heuristics for those domains, W is indeed very often far from the true suboptimality of wA*'s solution. We then analytically identify the potential sources of error. Finally, we present a practical method for correcting for two of these sources of error and experimentally show that the correction frequently eliminates much of the error.
    Comment: Published as a short paper in the 12th Annual Symposium on Combinatorial Search, SoCS 2019.
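
    A minimal sketch of the algorithm under discussion, on an assumed toy graph and heuristic: weighted A* orders open nodes by f(n) = g(n) + W·h(n), and with an admissible h the returned solution cost is at most W times optimal, which is exactly the bound the paper shows is usually loose in practice.

```python
# Weighted A* on a toy graph. With priority f(n) = g(n) + W * h(n) and
# admissible h, the returned cost is guaranteed <= W * optimal, though
# it is usually much closer to optimal than that. The graph and
# heuristic below are made-up examples, not the paper's benchmarks.

import heapq

def weighted_a_star(start, goal, neighbors, h, W=2.0):
    open_heap = [(W * h(start), 0.0, start)]   # (f, g, node)
    best_g = {start: 0.0}
    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if node == goal:
            return g                           # cost <= W * optimal
        if g > best_g.get(node, float("inf")):
            continue                           # stale queue entry
        for nxt, cost in neighbors(node):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(open_heap, (g2 + W * h(nxt), g2, nxt))
    return None

# Toy 1-D chain: states 0..10, unit step costs, goal at 10.
def neighbors(n):
    return [(m, 1.0) for m in (n - 1, n + 1) if 0 <= m <= 10]

cost = weighted_a_star(0, 10, neighbors, h=lambda n: 10 - n, W=2.0)
print(cost)   # 10.0: wA* returns the optimal cost here; the bound says only <= 20.0
```

    The gap visible in the final line (actual cost 10 versus guaranteed bound 20) is a small instance of the phenomenon the paper quantifies at scale.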

    Two steps reinforcement learning

    When applying reinforcement learning in domains with very large or continuous state spaces, the experience obtained by the learning agent in its interaction with the environment must be generalized. The generalization methods are usually based on approximating the value functions used to compute the action policy, and this is tackled in two different ways: on the one hand, by approximating the value functions with a supervised learning method; on the other hand, by discretizing the environment so that a tabular representation of the value functions can be used. In this work, we propose an algorithm that combines both approaches to exploit the benefits of each, allowing higher performance. The approach is based on two learning phases. In the first one, a learner is used as a supervised function approximator, but using a machine learning technique that also outputs a state space discretization of the environment, as nearest prototype classifiers or decision trees do. In the second learning phase, the space discretization computed in the first phase is used to obtain a tabular representation of the value function computed in the previous phase, allowing that value function approximation to be tuned. Experiments in different domains show that executing both learning phases improves the results obtained by executing only the first one. The results take into account both the resources used and the performance of the learned behavior.
    This research was partially conducted while the first author was visiting Carnegie Mellon University from the Universidad Carlos III de Madrid, supported by a generous grant from the Spanish Ministry of Education and Fulbright. Both authors were partially sponsored by the Spanish MEC project TIN2005-08945-C06-05 and regional CAM-UC3M project number CCG06-UC3M/TIC-0831.
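
    A minimal sketch of the two-phase idea, assuming a toy continuous chain task and scikit-learn's DecisionTreeRegressor as the discretizing supervised learner; the environment, value targets, and all parameters are invented stand-ins for the paper's setup, not its actual experiments.

```python
# Phase 1 fits a decision tree as a supervised value-function
# approximator; its leaves induce a state-space discretization
# (tree.apply maps a state to a leaf index). Phase 2 runs tabular
# Q-learning over those leaf indices to tune the value function.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
ACTIONS = (-0.1, +0.1)      # move left / right on [0, 1]; goal: reach 1.0

def step(s, a):
    s2 = min(max(s + a + rng.normal(0, 0.01), 0.0), 1.0)
    return s2, (1.0 if s2 >= 0.99 else 0.0), s2 >= 0.99

# --- Phase 1: supervised approximation with a discretizing learner ---
X, y = [[] for _ in ACTIONS], [[] for _ in ACTIONS]
for _ in range(2000):
    s = rng.uniform(0, 1)
    ai = rng.integers(len(ACTIONS))
    s2, r, done = step(s, ACTIONS[ai])
    # Crude target: next-state position as a cheap value proxy,
    # just to give the tree something sensible to split on.
    X[ai].append([s]); y[ai].append(r + (0 if done else 0.9 * s2))
trees = [DecisionTreeRegressor(max_leaf_nodes=16).fit(X[ai], y[ai])
         for ai in range(len(ACTIONS))]

# --- Phase 2: tabular Q-learning over the induced discretization ---
# For simplicity, the first tree's leaves serve as the shared cells.
leaf_id = lambda s: trees[0].apply([[s]])[0]
Q = {}
for _ in range(500):
    s, done = rng.uniform(0, 1), False
    for _ in range(100):
        cell = leaf_id(s)
        ai = rng.integers(len(ACTIONS)) if rng.random() < 0.1 else \
             max(range(len(ACTIONS)), key=lambda a: Q.get((cell, a), 0.0))
        s2, r, done = step(s, ACTIONS[ai])
        cell2 = leaf_id(s2)
        best2 = max(Q.get((cell2, a), 0.0) for a in range(len(ACTIONS)))
        q = Q.get((cell, ai), 0.0)
        Q[(cell, ai)] = q + 0.1 * (r + 0.9 * best2 - q)
        s = s2
        if done:
            break

print(len(Q), "tabular entries over", trees[0].get_n_leaves(), "cells")
```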

    ABC2 an agenda based multi-agent model for robots control and cooperation

    This paper presents a model for the control of autonomous robots that allows cooperation among them. The control structure is based on a general purpose multi-agent architecture using a hybrid approach made up of two levels. One level is composed of reactive skills capable of achieving simple actions on their own. The other uses an agenda as an opportunistic planning mechanism to compose, activate, and coordinate the basic skills. This agenda handles actions arising both from the robot's internal goals and from other robots. This two-level approach integrates the real-time response of reactive systems, needed for low-level robot behavior, with a classical high-level planning component that permits goal-oriented behavior. The paper describes the architecture itself and its use in three different domains, including real robots, as well as the issues arising from its adaptation to the RoboCup simulator domain.
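
    As a rough illustration (not ABC2 itself), the sketch below implements an agenda as a priority queue that composes and activates reactive skills, accepting actions both from internal goals and from other robots. Skill names, priorities, and the scheduling policy are illustrative assumptions.

```python
# A minimal agenda-based control loop in the spirit of the architecture
# described above: reactive skills are plain callables, and the agenda
# is a priority queue of pending actions fed both by the robot's own
# goals and by requests from teammates.

import heapq
import itertools

class Agenda:
    def __init__(self):
        self._heap, self._count = [], itertools.count()

    def post(self, priority, skill, source):
        # Lower number = more urgent; the counter breaks ties FIFO and
        # keeps heap entries comparable.
        heapq.heappush(self._heap, (priority, next(self._count), skill, source))

    def run(self):
        # Opportunistically activate the most urgent pending skill.
        while self._heap:
            _, _, skill, source = heapq.heappop(self._heap)
            skill(source)

# Reactive skills: simple actions achievable on their own.
def avoid_obstacle(source): print(f"avoid_obstacle (from {source})")
def go_to_ball(source):     print(f"go_to_ball (from {source})")
def pass_ball(source):      print(f"pass_ball (from {source})")

agenda = Agenda()
agenda.post(0, avoid_obstacle, "internal goal")   # safety first
agenda.post(2, pass_ball, "teammate request")     # cooperation
agenda.post(1, go_to_ball, "internal goal")
agenda.run()
```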